Abstract

Classical estimation techniques for linear models either are inconsistent, or perform rather poorly, under
$\alpha$-stable error densities; most of them are not even rate-optimal. In this paper, we propose an original
one-step R-estimation method and investigate its asymptotic performances under stable densities. Contrary 
to traditional least squares, the proposed R-estimators remain root-n consistent (the optimal rate) under 
the whole family of stable distributions, irrespective of their asymmetry and tail index. While parametric 
stable-likelihood estimation, due to the absence of a closed form for stable densities, is quite cumbersome, 
our method allows us to construct estimators reaching the parametric efficiency bounds associated with any 
prescribed values $(\alpha_0, b_0)$ of the tail index $\alpha$ and skewness parameter $b$, while preserving root-n consistency
under any $(\alpha, b)$ as well as under usual light-tailed densities. The method furthermore avoids all forms of
multidimensional argmin computation. Simulations confirm its excellent finite-sample performances. 

Keywords: Stable distributions, local asymptotic normality, LAD estimation, R-estimation, asymptotic 
relative efficiency. 



1. Introduction. 

Evidence of heavy-tailed behavior and infinite variances in economics and, even more so, in finance and 
insurance, is overwhelming. In such a context, the Gauss-Markov theorem for linear regression¹ no longer
holds true, and the usual OLS estimators of regression coefficients lose their theoretical justifications. Much
worse: they also lose their traditional² root-n consistency rates. OLS estimators under stable errors thus are
not even rate-optimal: Proposition 3.1 in Hallin, Swan, Verdebout and Veredas (2010) indeed establishes the 
local asymptotic normality, with root-n consistency rates, of linear models with stable errors, irrespective 
of their tail index and skewness parameter. 

This disturbing fact is by no means a new finding: see Wise (1966) or Blattberg and Sargent (1971) for
early discussion. Since then, the asymptotic behavior of estimators in linear models with infinite variance
and, more specifically, in models with (non-Gaussian) stable errors, has attracted much interest, and several
alternatives to OLS estimation have been proposed. Those alternative estimators, however, either suffer 
from major consistency problems, or are strictly inefficient and can be improved: see Section 1.1 for a brief 
review. The objective of this paper is to show how one-step R-estimation allows for a tractable and quite 
substantial rate-optimal improvement. 

1.1. Regression parameter estimation under stable errors. 

Before turning to R-estimation methods, let us briefly explain why classical estimation methods fail to 
provide fully satisfactory solutions. 



¹ Recall that the Gauss-Markov theorem establishes, for errors with finite variance, that OLS estimators are best linear
unbiased estimators.

² Under the classical condition that the regression constants satisfy Assumption (A1) below, an assumption we tacitly make
throughout this section.
(a) OLS estimators. As already mentioned, the main trouble with OLS estimators is that their consistency
rate depends on the tail index $\alpha$. This follows from the general results by Samorodnitsky et al. (2007)
on a class of linear unbiased estimators (see point (c) below). That rate is strictly less than the optimal
root-n rate, which is a severe drawback. Moreover, the related asymptotic confidence regions and Wald
tests cannot be constructed without estimating $\alpha$ itself.

(b) Stable MLEs. OLS estimators are the maximum likelihood estimators (MLEs) associated with Gaussian
likelihoods; better performances can be expected from stable likelihoods (involving the four parameters
of stable densities along with the regression coefficients of interest). A pioneering result by
DuMouchel (1973), indeed, shows that, somewhat surprisingly, stable MLEs (for location, scale, the
tail index $\alpha$, and the skewness parameter $b$) yield a very standard asymptotically normal behavior,
with traditional root-n rates. This result easily extends to the regression case³. Practical implementation,
of course, runs into the problem that non-Gaussian stable densities, hence stable likelihoods,
cannot be expressed in closed form. For specified tail index $\alpha$ and skewness parameter $b$, this is
no longer an obstacle, thanks to the computationally efficient integral approximations obtained by
Zolotarev (1986, 1995), Nolan (1997, 1999) and several others. But in practice, the tail index and the
skewness parameter also have to be estimated; the information matrix, moreover, is not block-diagonal
(see DuMouchel (1975)), so that the estimators $\hat\alpha$ and $\hat b$ of $\alpha$ and $b$ cannot simply be plugged into
the information matrix when confidence regions or Wald tests are to be constructed for the regression
parameters. Although asymptotically optimal, stable-likelihood-based inference in practice thus seems
difficult.

(c) Linear unbiased estimators. A broad class of linear unbiased estimators, of which OLS estimators are a
particular case, has been considered by Samorodnitsky et al. (2007), who also provide a quite complete
and systematic picture⁴ of their asymptotic behavior. Consistency rates, as a rule, crucially depend on
the tail index $\alpha$ of the underlying noise, and are strictly less than the optimal root-n ones; asymptotic
covariances depend on $\alpha$ as well. All the drawbacks of OLS estimation thus also are present here. The
BLU$\alpha$N (best linear unbiased estimator, relative to some adequate $\alpha$-norm, limited to $1 < \alpha < 2$)
estimators considered in El Barmi and Nelson (1997) suffer from the same problems.

(d) LAD estimators. The bad performances of $L_2$ estimators (OLS) considerably reinforce the attractiveness
of the $L_1$ approach. The so-called LAD (Least Absolute Deviations) estimators (a particular
case of more general quantile regression estimators in the Bassett and Koenker (1978) style) indeed,
irrespective of the tail index $\alpha$, achieve (under Assumption (A1)) root-n consistency. The asymptotic
properties of LAD estimators in regression models have been studied intensively: see Bassett
and Koenker (1978) for the standard case, Knight (1998) or El Bantli and Hallin (1999) for more
general results. Contrary to stable MLEs, BLUEs and OLS estimators, the LAD ones, thus, achieve
rate-optimal consistency. Constructing the related confidence regions and Wald tests is possible via
classical techniques, without any estimation of $\alpha$. These advantages of LAD estimation in the stable
context were emphasized as early as 1971 by Fama and Roll (1971). On the other hand, LAD
estimators, which are optimal under light-tailed double-exponential noise, cannot be efficient under
any heavy-tailed stable density. The objective of this paper is to show how LAD estimators can be
improved, often quite substantially, without specifying or estimating the tail index $\alpha$.

1.2. R-estimation under stable errors.

Estimation methods based on ranks — in short, R-estimation — go back to Hodges and Lehmann (1963), 
who provide R-estimators for one-sample and two-sample location models (under symmetric distributions, 



³ The situation is quite different for autoregressive and ARMA models (local experiments are no longer of the LAN type),
with $n^{1/\alpha}$ consistency rates under tail index $\alpha$, and convergence in distribution to the maximizer of a random function; see
Andrews et al. (2009) for recent results in that context.

⁴ Under very general assumptions on the asymptotic behavior of the regression constants (more general than Assumptions
(A1) and (A2) below), but assuming symmetric heavy-tailed errors, an assumption we do not make here.
for the one-sample case), based on the Wilcoxon and van der Waerden (signed) rank statistics. Since then,
the technique has been used in a variety of problems, including K-sample location, regression and analysis of
variance, time series analysis and elliptical families; see, e.g., Lehmann (1963), Sen (1966), Jurečková (1971),
Koul (1971), Jurečková and Sen (1996), Koul and Saleh (1993), Allal et al. (2001), Koul (2002), Hallin et
al. (2006), Hallin and Paindaveine (2008), and many others.

Ranks naturally appear as maximal invariants in semiparametric models where the density $f$ of some
unobservable noise constitutes the infinite-dimensional nuisance. Under classical Argmin form, the Hodges-Lehmann
or R-estimator of a parameter $\vartheta$ is defined as

$$\hat\vartheta^{(n)}_{HL} := \operatorname{argmin}_{t \in \mathbb{R}^K} \left| Q^{(n)}(R^{(n)}(t)) \right|, \qquad (1.1)$$

where $Q^{(n)}(R^{(n)}(\vartheta_0))$ is a (signed-)rank test statistic for the null hypothesis $H_0 : \vartheta = \vartheta_0$ (two-sided test).

The main advantage of $\hat\vartheta^{(n)}_{HL}$ over more usual M-estimators follows from the fact that (under parameter
value $\vartheta$ and error density $f$, and standard root-n consistency conditions), $n^{1/2}(\hat\vartheta^{(n)}_{HL} - \vartheta)$ is asymptotically
equivalent to a function which depends on the unknown actual density $f$ but is measurable with respect to
the ranks $R^{(n)}(\vartheta)$ of the unobservable noise (see Hallin and Paindaveine 2008 for details). The asymptotic
relative efficiencies (AREs) of the R-estimator $\hat\vartheta^{(n)}_{HL}$ defined in (1.1) with respect to other R-estimators, or
with respect to its Gaussian competitor (OLS or Gaussian MLE, whenever the latter are root-n consistent)
are the same as the AREs of the corresponding rank tests with respect to their Gaussian competitors.⁵

The Argmin form (1.1), however, is computationally inconvenient, particularly so in the case of a relatively
high-dimensional parameter $\vartheta$. Inspired by Le Cam's one-step estimation method, Hallin et al. (2006),
in the context of R-estimation of shape matrices in elliptical families, and Hallin and Paindaveine (2008,
unpublished manuscript), in a more general context, therefore introduced a one-step form of R-estimation.
That method, contrary to (1.1), avoids the computational inconvenience of minimizing, over a possibly
high-dimensional parameter space, a piecewise constant function of the form $|Q^{(n)}(R^{(n)}(t))|$; moreover it
also provides, as a by-product, the asymptotic covariance matrix of the R-estimator. On the other hand,
one-step methods require the existence of a preliminary rate-optimal consistent (here, root-n consistent)
estimator. This role will be played, in the present context, by the LAD estimator, the only one in the
existing literature enjoying the required consistency properties. Our R-estimators thus appear as one-step
improvements over the LAD estimators; they yield the same collection of ARE values as the corresponding
rank-based tests, the values of which were obtained in Hallin et al. (2010).

In this paper, we explain how that one-step method can be implemented for the estimation of the regression
parameter of a general linear model with stable errors, and we study the asymptotic performances of the
resulting R-estimators. Those R-estimators rely on a rank-based version of Le Cam's one-step methodology
which bypasses the nonparametric estimation of cross-information quantities. They are asymptotically normal
under any stable density (with standard root-n rate), and efficient at some prespecified stable density $f_\theta$.
They exhibit the same asymptotic relative efficiencies as the rank-based tests studied in Hallin et al. (2010).
For specific scores, they outperform LAD estimators, and hence all valid and tractable estimation methods
proposed in the literature. In particular, when based on certain stable scores, such as the score associated
with the symmetric stable distribution with tail parameter $\alpha = 1.4$ (see Figure 2), they dominate the LAD
under any stable distribution with $\alpha \in (1, 2)$. The computational advantages of one-step R-estimators over
the more classical Argmin ones lie in the fact that the K-dimensional minimization (1.1) of a non-convex
piecewise constant rank-based objective function is replaced by the minimization of a continuous, strictly
convex $L_1$ criterion (yielding the preliminary LAD estimator), followed by a one-dimensional optimization
problem; the LAD estimator, moreover, can be obtained exactly as the solution of a linear programming


⁵ Since Gaussian methods are generally invalid under stable error densities, AREs in the sequel are taken with respect to
double-exponential likelihood procedures, that is, least absolute deviation (LAD) estimators and the regression version of sign
tests (the Laplace rank tests).



problem. Table 3 below provides numerical evidence of the quite substantial advantages (in terms of bias
and mean squared error) of one-step R-estimation over its classical Argmin counterpart.
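
The linear-programming route to the preliminary LAD estimator mentioned above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `lad_fit`, the use of SciPy's `linprog` with the HiGHS solver, and the simulated Cauchy data are all choices made for the example.

```python
import numpy as np
from scipy.optimize import linprog


def lad_fit(c, x):
    """Least absolute deviations fit of x on an intercept and the columns of c.

    Each residual x_i - a - c_i' beta is written as u_i - v_i with
    u_i, v_i >= 0, and sum_i (u_i + v_i) is minimised.
    Returns (a_hat, beta_hat).
    """
    n, k = c.shape
    design = np.hstack([np.ones((n, 1)), c])
    # Decision variables: k + 1 free coefficients, then u (n) and v (n).
    cost = np.concatenate([np.zeros(k + 1), np.ones(2 * n)])
    a_eq = np.hstack([design, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * (k + 1) + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=a_eq, b_eq=x, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[0], res.x[1:k + 1]


# Hypothetical data: two regressors, Cauchy (alpha = 1, b = 0) errors.
rng = np.random.default_rng(0)
n = 200
c = rng.uniform(-1.0, 1.0, size=(n, 2))
x = 0.5 + c @ np.array([1.0, 1.0]) + rng.standard_cauchy(n)

a_hat, beta_hat = lad_fit(c, x)
ols = np.linalg.lstsq(np.hstack([np.ones((n, 1)), c]), x, rcond=None)[0]
print("LAD slopes:", beta_hat)
print("OLS slopes:", ols[1:])
```

The split of each residual into positive and negative parts is the standard device that turns the piecewise-linear $L_1$ objective into a linear program; any LP solver could replace the one assumed here.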



2. R-estimation of regression coefficients. 



2.1. Asymptotics for linear models with stable errors. 

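The family of $\alpha$-stable densities is a four-parameter family



$$\{ f_\theta = f_{\alpha,b,\gamma,\delta} \mid \theta := (\alpha, b, \gamma, \delta)' \in \Theta = (0,2] \times [-1,1] \times \mathbb{R}^+ \times \mathbb{R} \}$$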




(2.2) 



which characterizes the roles of $\delta$ and $\gamma$ as location and scale parameters, respectively, and that of $f_{\alpha,b}$ as
the standardized version of $f_{\alpha,b,\gamma,\delta}$. The parameters $\alpha$ and $b$ determine the shape of the distribution, with $\alpha$
being the characteristic exponent (or tail index) and $b$ the skewness parameter, an interpretation justified
by the fact that, for $b = 0$, $f_{\alpha,b,\gamma,\delta}$ is symmetric with respect to $\delta$ and, for $0 < b \le 1$ (resp., $-1 \le b < 0$),
skewed to the right (resp., to the left); see Section 1.2 of Samorodnitsky and Taqqu (1994) for details. The
notations $F_\theta$ and $F_{\alpha,b,\gamma,\delta}$ will be used for the distribution function associated with $f_\theta$.

Some particular choices of $\theta$ yield well-known distributions, namely the Gaussian ($\alpha = 2$, any $b$), the
Cauchy ($\alpha = 1$, $b = 0$) and the Lévy ($\alpha = 1/2$, $b = 1$). However, together with the reflected Lévy density,
these are the only instances of stable densities that can be expressed explicitly in terms of elementary
functions. For all other choices of the parameters, no closed form for $f_\theta$ is available, and stable distributions
either are defined in terms of characteristic functions and inverse Fourier transforms, or via integral formulas
(see e.g. Nolan (1997) or Zolotarev (1986)).
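
As a small numerical illustration of the characteristic-function description, the sketch below checks two of the closed-form cases against the symmetric, unit-scale characteristic function $t \mapsto \exp(-|t|^\alpha)$. The normalization used here ($b = 0$, unit scale, zero location, so that $\alpha = 2$ corresponds to the $N(0, 2)$ law) is an assumption made for the example, and the function names are hypothetical.

```python
import numpy as np


def cauchy_pdf(x):
    """Cauchy density: the alpha = 1, b = 0 case."""
    return 1.0 / (np.pi * (1.0 + x ** 2))


def gaussian_pdf(x):
    """N(0, 2) density: the alpha = 2 case when exp(-t**2) is the characteristic function."""
    return np.exp(-x ** 2 / 4.0) / (2.0 * np.sqrt(np.pi))


def char_fn(pdf, t, half_width=2000.0, step=0.01):
    """Riemann-sum approximation of E[cos(tX)] on a truncated, symmetric grid."""
    x = np.arange(-half_width, half_width + step, step)
    return float(np.sum(np.cos(t * x) * pdf(x)) * step)


# Compare the numerical characteristic function with exp(-|t|**alpha).
for alpha, pdf in [(1.0, cauchy_pdf), (2.0, gaussian_pdf)]:
    for t in (0.5, 1.0, 2.0):
        print(alpha, t, char_fn(pdf, t), np.exp(-abs(t) ** alpha))
```

For densities without a closed form, the same relation would have to be inverted numerically, which is what the integral formulas cited above are designed to do efficiently.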

Throughout, we consider a vector $X^{(n)} := (X_1^{(n)}, \ldots, X_n^{(n)})'$ of observations satisfying

$$X_i^{(n)} = a + \sum_{k=1}^{K} c_{ik}^{(n)} \beta_k + \epsilon_i^{(n)}, \qquad i = 1, \ldots, n, \qquad (2.3)$$

for some intercept $a \in \mathbb{R}$ and regression parameter $\beta := (\beta_1, \ldots, \beta_K)' \in \mathbb{R}^K$; $c_{i1}^{(n)}, \ldots, c_{iK}^{(n)}$ ($i =
1, \ldots, n$) are regression constants, and $\{\epsilon_i^{(n)}, i \in \mathbb{N}\}$ is a sequence of nonobservable i.i.d. errors with stable
density $f_\theta$, $\theta = (\alpha, b, \gamma, 0) \in \Theta$.

The construction of our R-estimators is based on the uniform local asymptotic normality (ULAN) property,
with respect to $\beta$, of the regression model (2.3) under stable error densities. That property is established
in Hallin et al. (2010) under the following technical assumptions. Without loss of generality, we impose that
$\sum_{i=1}^n c_{ik}^{(n)} = 0$ for $k = 1, \ldots, K$; letting $c_i^{(n)} := (c_{i1}^{(n)}, \ldots, c_{iK}^{(n)})'$ and $C^{(n)} := n^{-1} \sum_{i=1}^n c_i^{(n)} c_i^{(n)\prime}$, we make the
following assumptions on the asymptotic behavior of the regression constants.

Assumption (A1) For all $n \in \mathbb{N}$, $C^{(n)}$ is positive definite and converges, as $n \to \infty$, to a positive definite
matrix $\mathbf{K}^{-2}$.

Assumption (A2) (Noether conditions) For all $k = 1, \ldots, K$, one has

$$\lim_{n \to \infty} \max_{1 \le i \le n} (c_{ik}^{(n)})^2 \Big/ \sum_{i=1}^{n} (c_{ik}^{(n)})^2 = 0.$$

Denoting by $P^{(n)}_{\theta;a,\beta}$ the probability distribution of $X^{(n)}$ under (2.3), let

$$Z_i^{(n)}(a, \beta) := X_i^{(n)} - a - \sum_{k=1}^{K} c_{ik}^{(n)} \beta_k, \qquad i = 1, \ldots, n,$$

stand for the residuals associated with the value $\beta$ of the regression parameter: under $P^{(n)}_{\theta;a,\beta}$, the $Z_i^{(n)}(a, \beta)$'s
thus are i.i.d. with density $f_{(\alpha,b,\gamma,0)}$. Here and in the sequel, we write $Z_i^{(n)}(\beta)$ instead of $Z_i^{(n)}(a, \beta)$ for
the sake of simplicity. Although the quantity appearing in Proposition 2.1 depends on $a$, the rank-based
statistics $\Delta_J^{(n)}$ defined in (2.5) below do not, as the $Z_i^{(n)}(\beta)$'s only enter the definition through their ranks,
which do not depend on $a$ (fortunately so, as $a$ remains an unspecified nuisance). The following result is
proved in Hallin et al. (2010).
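
The invariance of the residual ranks with respect to the intercept can be checked numerically. The following sketch is illustrative only; the helper `residual_ranks`, the simulated data and the two intercept values are assumptions made for the example.

```python
import numpy as np
from scipy.stats import rankdata


def residual_ranks(x, c, beta, a=0.0):
    """Ranks of the residuals x_i - a - c_i' beta among themselves."""
    return rankdata(x - a - c @ beta)


rng = np.random.default_rng(1)
n = 50
c = rng.uniform(-1.0, 1.0, size=(n, 2))
x = 3.0 + c @ np.array([1.0, 1.0]) + rng.standard_cauchy(n)

beta = np.array([0.8, 1.1])          # an arbitrary trial value of the slopes
ranks_a0 = residual_ranks(x, c, beta, a=0.0)
ranks_a7 = residual_ranks(x, c, beta, a=7.5)

# Shifting every residual by the same constant leaves their ordering unchanged.
print(np.array_equal(ranks_a0, ranks_a7))
```

Any statistic built from these ranks therefore takes the same value whatever intercept is plugged in, which is why the intercept never has to be estimated in the rank-based step.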

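Proposition 2.1 (ULAN, Hallin, Swan, Verdebout and Veredas 2010). Suppose that Assumptions (A1)
and (A2) hold. Fix $\theta = (\alpha, b, \gamma, 0) \in \Theta$. Then, model (2.3) (the family $\{P^{(n)}_{\theta;a,\beta} \mid \beta \in \mathbb{R}^K\}$) is ULAN with
respect to $\beta$, with contiguity rate $n^{1/2}$. More precisely, letting $\nu(n) := n^{-1/2} \mathbf{K}^{(n)}$ with $\mathbf{K}^{(n)} := (C^{(n)})^{-1/2}$, for
all $\beta \in \mathbb{R}^K$, all sequences $\beta^{(n)}$ such that $\nu^{-1}(n)(\beta^{(n)} - \beta) = O(1)$ and all bounded sequences $\tau^{(n)} \in \mathbb{R}^K$,

$$\Lambda^{(n)} := \log \left( dP^{(n)}_{\theta;a,\beta^{(n)} + \nu(n)\tau^{(n)}} \big/ dP^{(n)}_{\theta;a,\beta^{(n)}} \right)$$

$$= \log \left( \prod_{i=1}^{n} f_\theta\big(Z_i^{(n)}(\beta^{(n)} + \nu(n)\tau^{(n)})\big) \Big/ \prod_{i=1}^{n} f_\theta\big(Z_i^{(n)}(\beta^{(n)})\big) \right)$$

$$= \tau^{(n)\prime} \Delta^{(n)}_\theta(\beta^{(n)}) - \frac{1}{2}\, \tau^{(n)\prime} \tau^{(n)}\, \mathcal{I}(\theta) + o_P(1)$$

under $P^{(n)}_{\theta;a,\beta^{(n)}}$ as $n \to \infty$, where, setting $\varphi_\theta := -\dot f_\theta / f_\theta$, with $\dot f_\theta$ the derivative of $x \mapsto f_\theta(x)$ and

$$\mathcal{I}(\theta) := \int_{-\infty}^{\infty} \varphi_\theta^2(x) f_\theta(x)\, dx,$$

$\mathcal{I}(\theta) I_K$ is the information matrix and

$$\Delta^{(n)}_\theta(\beta) := n^{-1/2}\, \mathbf{K}^{(n)\prime} \sum_{i=1}^{n} \varphi_\theta\big(Z_i^{(n)}(\beta)\big)\, c_i^{(n)} \xrightarrow{\ \mathcal{L}\ } \mathcal{N}\big(0, \mathcal{I}(\theta) I_K\big) \qquad (2.4)$$

the central sequence.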



ULAN, here as in Hallin et al. (2010), is stated under stable distributions, but of course is well known
to hold under any density $f$ such that $f^{1/2}$ is differentiable in quadratic mean; $P^{(n)}_{\theta;a,\beta}$, $\varphi_\theta$ and $\mathcal{I}(\theta)$ then
are to be replaced with $P^{(n)}_{f;a,\beta}$, $\varphi_f := -2 D(f^{1/2}) / f^{1/2}$ and $\mathcal{I}_f$, where $D(f^{1/2})$ stands for the quadratic mean
derivative of $f^{1/2}$ and $\mathcal{I}_f := \int_{-\infty}^{\infty} \varphi_f^2(x) f(x)\, dx$. Denote by $\mathcal{F}$ that class of densities and by $\Delta^{(n)}_f(\beta)$ the
corresponding central sequences.

2.2. One-step R-estimators.

The vector $R^{(n)} = R^{(n)}(\beta) := (R_1^{(n)}, \ldots, R_n^{(n)})$, where $R_i^{(n)} = R_i^{(n)}(\beta)$ denotes the rank of the residual
$Z_i^{(n)} = Z_i^{(n)}(\beta)$, $i = 1, \ldots, n$, among $Z_1^{(n)}, \ldots, Z_n^{(n)}$, is distribution-free as $f$ and $a$ range over the class of all
nonvanishing densities and $\mathbb{R}$, respectively. Throughout, we consider the class of rank-based statistics

$$\Delta_J^{(n)}(\beta) := n^{-1/2}\, \mathbf{K}^{(n)\prime} \sum_{i=1}^{n} J\!\left( \frac{R_i^{(n)}(\beta)}{n+1} \right) c_i^{(n)}, \qquad (2.5)$$

where $J : (0, 1) \to \mathbb{R}$ is some score generating function satisfying

Assumption (B) The score function $J : (0, 1) \to \mathbb{R}$ is not constant, and is the difference $J_1 - J_2$ between two
right-continuous and square-integrable monotone non-decreasing functions $J_1$ and $J_2 : (0, 1) \to \mathbb{R}$.
Strongly unimodal densities $f$ trivially satisfy that assumption.⁶ Except for the Gaussian one, stable
densities (2.4) are not strongly unimodal. However, $u \mapsto \varphi_f(F^{-1}(u))$ being bounded (in absolute value) and
continuously differentiable, with a derivative changing signs exactly twice, it has bounded variation, hence
can be expressed as the difference between two monotone increasing functions; $\varphi_f$ therefore also can.

The following result summarizes the asymptotic properties of the rank-based statistics (2.5); see the 
Appendix for a proof. 

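Proposition 2.2 Let Assumptions (A1), (A2) and (B) hold. Then,

(i) letting $\Delta_{J;g}^{(n)}(\beta) := n^{-1/2}\, \mathbf{K}^{(n)\prime} \sum_{i=1}^{n} J\big(G(Z_i^{(n)}(\beta))\big)\, c_i^{(n)}$, where $G$ stands for the distribution function
associated with a density $g \in \mathcal{F}$, we have, under $P^{(n)}_{g;a,\beta}$, as $n \to \infty$,

$$\Delta_J^{(n)}(\beta) - \Delta_{J;g}^{(n)}(\beta) = o_P(1). \qquad (2.6)$$

Hence, for $J(u) = \varphi_f(F^{-1}(u))$ with $f \in \mathcal{F}$, $\Delta_J^{(n)}(\beta)$ is asymptotically equivalent,⁷ under $P^{(n)}_{f;a,\beta}$,
to $\Delta_f^{(n)}(\beta)$;

(ii) under $P^{(n)}_{g;a,\beta}$ ($g \in \mathcal{F}$), $\Delta_J^{(n)}(\beta)$ is asymptotically normal with mean zero and covariance matrix
$\mathcal{J}(J) I_K$, where $\mathcal{J}(J) := \int_0^1 J^2(u)\, du$;

(iii) under $P^{(n)}_{g;a,\beta + \nu(n)\tau}$ ($g \in \mathcal{F}$), $\Delta_J^{(n)}(\beta)$ is asymptotically normal with mean $\mathcal{J}(J, g)\tau$ and covariance
matrix $\mathcal{J}(J) I_K$, where

$$\mathcal{J}(J, g) := \int_0^1 J(u)\, \varphi_g(G^{-1}(u))\, du; \qquad (2.7)$$

(iv) $\Delta_J^{(n)}(\beta)$ satisfies the asymptotic linearity property

$$\Delta_J^{(n)}(\beta + \nu(n)\tau^{(n)}) - \Delta_J^{(n)}(\beta) = -\mathcal{J}(J, g)\, \tau^{(n)} + o_P(1) \qquad (2.8)$$

under $P^{(n)}_{g;a,\beta}$, with $g \in \mathcal{F}$, as $n \to \infty$.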

Under the conditions of Proposition 2.1, the Le Cam one-step methodology requires the existence of a
preliminary root-n consistent estimator $\hat\beta^{(n)}$ of $\beta$. The LAD estimator $\hat\beta^{(n)}_{LAD}$ of $\beta$, which we are considering
in the sequel, is one possibility, but any other estimator enjoying root-n consistency under the whole class
of stable densities would be an equally valid candidate.

The LAD estimator $(\hat a^{(n)}_{LAD}, \hat\beta^{(n)\prime}_{LAD})'$ of $(a, \beta')'$ is obtained by minimizing the $L_1$-objective function:

$$(\hat a^{(n)}_{LAD}, \hat\beta^{(n)\prime}_{LAD})' := \operatorname{argmin}_{(a, \beta')' \in \mathbb{R}^{K+1}} \sum_{i=1}^{n} \left| Z_i^{(n)}(a, \beta) \right|.$$

In this context, however, $a$ need not be estimated, as ranks are insensitive to location shift; we therefore
concentrate on $\hat\beta^{(n)}_{LAD}$. In order to control for the uniformity of local behaviors, a discretized version
of $\hat\beta^{(n)}_{LAD}$ should be considered in theoretical asymptotic statements. The discretization trick, which is due
to Le Cam, is quite standard in the context of one-step estimation. While retaining root-n consistency,



⁶ A density $f$ is called strongly unimodal if $f^{1/2}$ is differentiable in quadratic mean and $\varphi_f$ is monotone increasing; Gaussian,
logistic and double exponential densities are strongly unimodal.

⁷ Since central sequences are only defined up to $o_P(1)$ terms, $\Delta_J^{(n)}(\beta)$ thus is a rank-based version of the central
sequence $\Delta_f^{(n)}(\beta)$.



discretized estimators indeed enjoy the important property of asymptotic local discreteness, that is, they
only take a finite number of distinct values, as $n \to \infty$, in $\beta$-centered balls with $O(n^{-1/2})$ radius. In fixed-n
practice, however, such discretizations are irrelevant (the discretization constant can be chosen arbitrarily
large). For the sake of simplicity, we will henceforth tacitly assume that $\hat\beta^{(n)}_{LAD}$, in asymptotic statements,
has been adequately discretized.

Were $\mathcal{J}^{-1}(J, g)$ a known quantity, the one-step R-estimator of $\beta$ would take (since the asymptotic
variance of $\Delta_J^{(n)}$ is proportional to an identity matrix) the following very simple form:

$$\tilde\beta_J^{(n)} := \hat\beta^{(n)}_{LAD} + \nu(n)\, \mathcal{J}^{-1}(J, g)\, \Delta_J^{(n)}(\hat\beta^{(n)}_{LAD}). \qquad (2.9)$$

It readily follows from (2.8) (as well as from standard results on one-step estimation: see, e.g., Proposition 1
in Chapter 6 of Le Cam and Yang (2000)) that

$$\nu^{-1}(n)(\tilde\beta_J^{(n)} - \beta) = \mathcal{J}^{-1}(J, g)\, \Delta_J^{(n)}(\beta) + o_P(1),$$

hence, that $\nu^{-1}(n)(\tilde\beta_J^{(n)} - \beta)$ is asymptotically $\mathcal{N}(0, (\mathcal{J}(J)/\mathcal{J}^2(J, g)) I_K)$ under $P^{(n)}_{g;a,\beta}$ ($g \in \mathcal{F}$). This in turn
implies that $\nu^{-1}(n)(\tilde\beta_J^{(n)} - \beta)$, for $J(u) = \varphi_f(F^{-1}(u))$, is asymptotically $\mathcal{N}(0, \mathcal{J}^{-1}(J) I_K)$ under $P^{(n)}_{f;a,\beta}$,
that is, reaches parametric efficiency at correctly specified density $f = g$.

Unfortunately, the scalar cross-information quantity $\mathcal{J}(J, g)$ is not known, a phenomenon that does not
appear in the usual one-step method, based on the "parametric central sequence" associated with some
correctly identified density $f = g$. Under definition (2.9), $\tilde\beta_J^{(n)}$ therefore is not a genuine estimator. That
cross-information quantity $\mathcal{J}(J, g)$ thus has to be consistently estimated. To obtain such a consistent estimator,
we adopt here the idea first developed in Hallin et al. (2006) and generalized in Cassart et al. (2010).

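For all $v > 0$, define $\tilde\beta^{(n)}(v) := \hat\beta^{(n)}_{LAD} + n^{-1/2}\, v\, \mathbf{K}^{(n)} \Delta_J^{(n)}(\hat\beta^{(n)}_{LAD})$, and consider the scalar product

$$h^{(n)}(v) := \big(\Delta_J^{(n)}(\hat\beta^{(n)}_{LAD})\big)'\, \Delta_J^{(n)}(\tilde\beta^{(n)}(v)).$$

Proposition 2.2, the consistency and local asymptotic discreteness of $\hat\beta^{(n)}_{LAD}$, and the definition of $\tilde\beta^{(n)}(v)$
entail that, under $P^{(n)}_{g;a,\beta}$ with $g \in \mathcal{F}$,

$$h^{(n)}(v) = \big(\Delta_J^{(n)}(\hat\beta^{(n)}_{LAD})\big)' \Big( \Delta_J^{(n)}(\hat\beta^{(n)}_{LAD}) - \mathcal{J}(J, g)\, n^{1/2} (\mathbf{K}^{(n)})^{-1} \big(\tilde\beta^{(n)}(v) - \hat\beta^{(n)}_{LAD}\big) \Big) + o_P(1)$$

$$= \big(\Delta_J^{(n)}(\hat\beta^{(n)}_{LAD})\big)' \Big( \Delta_J^{(n)}(\hat\beta^{(n)}_{LAD}) - \mathcal{J}(J, g)\, v\, \Delta_J^{(n)}(\hat\beta^{(n)}_{LAD}) \Big) + o_P(1)$$

$$= \big(1 - \mathcal{J}(J, g)\, v\big)\, h^{(n)}(0) + o_P(1) \qquad (2.10)$$

for any $v > 0$; this provides the intuition for taking the solution of $h^{(n)}(v) = 0$ as an estimator of $(\mathcal{J}(J, g))^{-1}$.
And, provided that $h^{(n)}(0)$ is not $o_P(1)$, a consistent estimator of $(\mathcal{J}(J, g))^{-1}$ indeed would be

$$v^{(n)} := \inf\{v > 0 : h^{(n)}(v) < 0\}.$$

More precisely, consider a discretization of the positive half-line, with $v_\ell := \ell/c$, $\ell \in \mathbb{N}$, $c > 0$ a (typically,
large) discretizing constant, the value of which, however, plays no role in asymptotic statements. Putting

$$v^{(n)}_{-} := \min\{v_\ell \text{ such that } h^{(n)}(v_{\ell+1}) < 0\} \quad \text{and} \quad v^{(n)}_{+} := v^{(n)}_{-} + \frac{1}{c}, \qquad (2.11)$$

consider the linear interpolation

$$\hat v^{(n)} := v^{(n)}_{-} + \big(v^{(n)}_{+} - v^{(n)}_{-}\big)\, \frac{h^{(n)}(v^{(n)}_{-})}{h^{(n)}(v^{(n)}_{-}) - h^{(n)}(v^{(n)}_{+})}. \qquad (2.12)$$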



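It follows from Proposition 2.1 in Cassart et al. (2010) that, unless $h^{(n)}(0)$ is $o_P(1)$, $\hat{\mathcal{J}}(J, g) := (\hat v^{(n)})^{-1}$
provides a consistent estimator of the cross-information quantity $\mathcal{J}(J, g)$. Our one-step R-estimator then is
defined as

$$\hat\beta_J^{(n)} := \tilde\beta^{(n)}\big(\hat{\mathcal{J}}^{-1}(J, g)\big) = \hat\beta^{(n)}_{LAD} + \nu(n)\, \hat{\mathcal{J}}^{-1}(J, g)\, \Delta_J^{(n)}(\hat\beta^{(n)}_{LAD}).$$

Now, if $J$ is such that $\Delta_J^{(n)}(\hat\beta^{(n)}_{LAD}) = o_P(1)$, that is, if the Laplace or double-exponential score function
$u \mapsto J_L(u) := \sqrt{2}\, \mathrm{sign}(u - 1/2)$ is considered, we have (see Proposition 2.4) $\tilde\beta_J^{(n)} = \hat\beta^{(n)}_{LAD} + o_P(n^{-1/2})$ and
$\hat\beta_J^{(n)} = \hat\beta^{(n)}_{LAD} + o_P(n^{-1/2})$, so that our estimator coincides, asymptotically, with the LAD estimator.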
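
The construction of this section (preliminary LAD fit, rank statistic, grid search for the sign change of the scalar product, linear interpolation, one-step update) can be sketched as follows. This is a hedged illustration rather than the authors' code: the Wilcoxon score, the grid constant `grid_const = 100`, the cap `max_steps`, the fallback to the LAD estimate when no sign change is found, and every function name are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.stats import rankdata


def lad_slopes(c, x):
    """Preliminary LAD estimate of the slopes (intercept fitted, then dropped)."""
    n, k = c.shape
    design = np.hstack([np.ones((n, 1)), c])
    cost = np.concatenate([np.zeros(k + 1), np.ones(2 * n)])
    a_eq = np.hstack([design, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * (k + 1) + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=a_eq, b_eq=x, bounds=bounds, method="highs")
    return res.x[1:k + 1]


def wilcoxon_score(u):
    return np.pi / np.sqrt(3.0) * (2.0 * u - 1.0)


def rank_statistic(beta, c, x, k_mat, score):
    """n^{-1/2} K' sum_i J(R_i(beta) / (n + 1)) c_i, for centred constants c."""
    n = len(x)
    ranks = rankdata(x - c @ beta)
    return k_mat.T @ (c.T @ score(ranks / (n + 1.0))) / np.sqrt(n)


def one_step_r(c, x, score=wilcoxon_score, grid_const=100.0, max_steps=20000):
    """One-step R-estimate of the slopes; returns (beta_hat, beta_lad, v_hat)."""
    n = len(x)
    c = c - c.mean(axis=0)                             # impose sum_i c_ik = 0
    evals, evecs = np.linalg.eigh(c.T @ c / n)         # spectral form of C^(n)
    k_mat = evecs @ np.diag(evals ** -0.5) @ evecs.T   # K^(n) = (C^(n))^{-1/2}
    beta_lad = lad_slopes(c, x)
    delta0 = rank_statistic(beta_lad, c, x, k_mat, score)
    direction = k_mat @ delta0 / np.sqrt(n)            # nu(n) Delta_J(beta_lad)

    def h(v):
        return delta0 @ rank_statistic(beta_lad + v * direction, c, x, k_mat, score)

    v_hat, h_prev = 0.0, h(0.0)                        # v_hat = 0 means "keep LAD"
    for step in range(1, max_steps + 1):
        h_cur = h(step / grid_const)
        if h_cur < 0.0:
            v_minus = (step - 1) / grid_const
            # zero of the segment joining (v_minus, h_prev) and (v_plus, h_cur)
            v_hat = v_minus + (1.0 / grid_const) * h_prev / (h_prev - h_cur)
            break
        h_prev = h_cur
    return beta_lad + v_hat * direction, beta_lad, v_hat


rng = np.random.default_rng(42)
n = 300
c = rng.uniform(-1.0, 1.0, size=(n, 2))
x = 2.0 + c @ np.ones(2) + rng.standard_cauchy(n)      # Cauchy errors: alpha = 1, b = 0
beta_hat, beta_lad, v_hat = one_step_r(c, x)
print("LAD:", beta_lad, "one-step R:", beta_hat, "v_hat:", v_hat)
```

Only one-dimensional evaluations of the scalar product are needed after the LAD step, which is the computational point made in the text: no multidimensional minimization of a piecewise constant rank criterion is ever performed.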

The following result (see the Appendix for a proof) summarizes the asymptotic properties of $\hat\beta_J^{(n)}$.

Proposition 2.3 Let Assumptions (A1), (A2) and (B) hold. Then, $n^{1/2}(\hat\beta_J^{(n)} - \beta)$ is asymptotically normal
with mean zero and covariance matrix $(\mathcal{J}(J)/\mathcal{J}^2(J, g))\, \mathbf{K}^2$ under $P^{(n)}_{g;a,\beta}$ with $g \in \mathcal{F}$. Therefore, letting
$J(u) = \varphi_f(F^{-1}(u))$, $\hat\beta_J^{(n)}$ achieves the parametric efficiency bound under $P^{(n)}_{f;a,\beta}$.

In view of Proposition 2.3, the asymptotic relative efficiencies of our R-estimators clearly coincide with
those of the corresponding tests developed in Hallin et al. (2010). More precisely, we have that

$$\mathrm{ARE}_g(J_1/J_2) = \frac{\mathcal{J}^2(J_1, g)\, \mathcal{J}(J_2)}{\mathcal{J}^2(J_2, g)\, \mathcal{J}(J_1)}, \qquad (2.13)$$

where $\mathrm{ARE}_g(J_1/J_2)$ denotes the asymptotic relative efficiency, under density $g$, of the R-estimator $\hat\beta^{(n)}_{J_1}$,
based on the score-generating function $J_1$, with respect to the R-estimator $\hat\beta^{(n)}_{J_2}$, based on the
score-generating function $J_2$.



Table 1: AREs of R-estimators with respect to LAD estimators

                                   Underlying stable density
Estimators       $\alpha = 2$; $b = 0$   $\alpha = 1.8$; $b = 0$   $\alpha = 1.8$; $b = 0.5$   $\alpha = 0.5$; $b = 0.5$
$J_W$                  1.4999                 1.3888                   1.3984                     1.7776
$J_{vdW}$              1.5708                 1.3056                   1.3285                     1.251
$J_C$                  0.6759                 0.7880                   0.7769                     2.007
$J_{1.8;0}$            1.4459                 1.4183                   1.4222                     1.6453
$J_{1.8;.5}$           1.4452                 1.3969                   1.4459                     1.4432
$J_{.5;.5}$            0.0925                 0.1099                   0.1175                    21.2364

AREs for R-estimators based on various scores with respect to the LAD estimator.
Columns correspond to the (stable) densities under which AREs are computed, rows
to the scores considered: Wilcoxon ($J_W$), van der Waerden ($J_{vdW}$), Cauchy ($J_C$), and
three ($\delta = 0$, $\gamma = 1$) stable scores ($J_{\alpha;b}$); recall that the R-estimator based on Laplace
scores asymptotically coincides with the LAD estimator (see Proposition 2.4).



Traditional scores (such as the van der Waerden, Wilcoxon and Laplace ones) are associated with some
classical light-tailed densities (such as the normal, logistic and double-exponential), leading to the
score-generating functions

$$J_{vdW}(u) = \Phi^{-1}(u), \qquad J_W(u) = \frac{\pi}{\sqrt{3}}\,(2u - 1), \qquad \text{and} \qquad J_L(u) = \sqrt{2}\, \mathrm{sign}(u - 1/2),$$
[Figure 1 panels: Wilcoxon/LAD, Cauchy/LAD, VdW/LAD; horizontal axis: $\alpha$ from 0.5 to 2.0]

Figure 1: AREs of R-estimators based on Wilcoxon, Cauchy and van der Waerden scores, with respect to the
LAD estimator, as a function of $\alpha$ and for various values of $b$.



respectively, where $\Phi$ denotes, as usual, the standard normal distribution function. The resulting R-estimators
reach parametric efficiency under Gaussian, logistic and double-exponential densities, respectively.
Stable scores, of the form $J_\theta(u) = -\dot f_\theta(F_\theta^{-1}(u)) / f_\theta(F_\theta^{-1}(u))$, where $f_\theta$ is some stable density, also
can be considered, though not in closed form; we refer to Appendix B of Hallin et al. (2010), where
rank tests based on such stable scores are discussed, for details. Table 1 and Figures 1 and 2 provide numerical
values of AREs in (2.13) for various estimators and underlying stable densities. Interestingly, the
R-estimators based on the stable scores for tail index 1.4 uniformly dominate, irrespective of the asymmetry
parameter $b$, the LAD estimator for all values of $\alpha \in [1,2]$. Their AREs with respect to LAD estimators
moreover culminate in the vicinity of $\alpha = 1.8$, a value which is generally recognized as a reasonable tail
index for financial data.⁸
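
The ARE formula (2.13) can be evaluated numerically in the one case where everything is available in closed form, namely a Gaussian density $g$ (the $\alpha = 2$ column of Table 1). The sketch below is an illustration under stated assumptions: it uses a standard normal $g$ (AREs do not depend on the scale of $g$, since both cross-information quantities are rescaled by the same factor), a midpoint rule on $(0, 1)$, and hypothetical function names.

```python
import numpy as np
from scipy.stats import norm


def j_vdw(u):
    return norm.ppf(u)


def j_wilcoxon(u):
    return np.pi / np.sqrt(3.0) * (2.0 * u - 1.0)


def j_laplace(u):
    return np.sqrt(2.0) * np.sign(u - 0.5)


# Midpoint grid on (0, 1) used for all integrals over u.
N = 400_000
U = (np.arange(N) + 0.5) / N


def score_information(score):
    """Approximates the integral of J(u)**2 over (0, 1)."""
    return float(np.mean(score(U) ** 2))


def cross_information_gaussian(score):
    """Approximates the cross-information for standard normal g,
    for which phi_g(G^{-1}(u)) reduces to Phi^{-1}(u)."""
    return float(np.mean(score(U) * norm.ppf(U)))


def are_gaussian(score1, score2):
    """ARE of the score1-based R-estimator with respect to the score2-based one."""
    num = cross_information_gaussian(score1) ** 2 * score_information(score2)
    den = cross_information_gaussian(score2) ** 2 * score_information(score1)
    return num / den


print("vdW / Laplace:     ", are_gaussian(j_vdw, j_laplace))
print("Wilcoxon / Laplace:", are_gaussian(j_wilcoxon, j_laplace))
```

Since the Laplace-score R-estimator is asymptotically equivalent to the LAD estimator, these two ratios are the Gaussian-case AREs with respect to LAD; analytically they equal $\pi/2$ and $3/2$, in line with the first column of Table 1. The other columns would require the stable scores, which are not available in closed form.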

To conclude this section, the following result establishes the asymptotic equivalence between the LAD 
estimator and the Laplace R-estimator (based on the score function Jl); see the Appendix for a proof. 

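Proposition 2.4 Let Assumptions (A1) and (A2) hold. Then, the difference $\hat\beta^{(n)}_{J_L} - \hat\beta^{(n)}_{LAD}$ is $o_P(n^{-1/2})$
as $n \to \infty$ under $P^{(n)}_{g;a,\beta}$ for any $g \in \mathcal{F}$ such that $g$ is strictly positive at the median $G^{-1}(1/2)$.

As a direct consequence, the ARE (under $P^{(n)}_{g;a,\beta}$ with $g \in \mathcal{F}$) of any estimator $\hat\beta^{(n)}_J$ with respect to $\hat\beta^{(n)}_{LAD}$
is equal to the ARE of $\hat\beta^{(n)}_J$ with respect to $\hat\beta^{(n)}_{J_L}$.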

3. Finite-sample performance. 

This section is devoted to a simulation study of the finite-sample performances of the various R-estimators
described in the previous sections and some of their competitors, in order to check whether these performances
are in line with the ARE results of Table 1.

We generated M = 1000 samples from two multiple regression models,

$$Y_i^{(1)} = c_{i1} + c_{i2} + \epsilon_i, \qquad i = 1, \ldots, n = 100, \qquad (3.14)$$



⁸ Dominicy and Veredas (2010) found that the estimated $\alpha$ for 22 major worldwide market indexes (nine years of daily
returns) ranges from 1.55 to 1.90, with an average of 1.75. Similar values have been obtained for other financial assets, e.g.
in Mittnik et al. (2000) or Deo (2002).
with two regressors, and

$$Y_i = c_{i1} + c_{i2} + c_{i3} + c_{i4} + \epsilon_i, \qquad i = 1, \ldots, n = 100, \qquad (3.15)$$

with four regressors, both with $\alpha$-stable i.i.d. $\epsilon_i$'s. The regression constants $c_{ij}$ (the same ones across
the 1000 replications) were drawn (independently) from the uniform distribution on $[-1,1]^2$ and $[-1,1]^4$,
respectively. Letting $1_K := (1, 1, \ldots, 1)' \in \mathbb{R}^K$, the true values of the regression parameters are thus $\beta = 1_2$
in model (3.14) and $\beta = 1_4$ in model (3.15).
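
Data of the kind described in this design can be simulated, in the symmetric case, with the Chambers-Mallows-Stuck representation. The sketch below is an illustration only: the text does not say how the stable errors were generated, the sampler covers only $b = 0$ with unit scale (characteristic function $\exp(-|t|^\alpha)$), and the function name is hypothetical.

```python
import numpy as np


def symmetric_stable_rvs(alpha, size, rng):
    """Chambers-Mallows-Stuck draws from the symmetric alpha-stable law
    with characteristic function exp(-|t|**alpha), for 0 < alpha <= 2."""
    u = rng.uniform(-np.pi / 2.0, np.pi / 2.0, size)   # uniform angle
    w = rng.exponential(1.0, size)                     # unit exponential
    if alpha == 1.0:
        return np.tan(u)                               # Cauchy case
    return (np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


# One replication of a two-regressor design with unit slopes and no intercept.
rng = np.random.default_rng(2010)
n = 100
c = rng.uniform(-1.0, 1.0, size=(n, 2))   # regression constants, kept fixed across replications
eps = symmetric_stable_rvs(1.8, n, rng)
y = c @ np.ones(2) + eps
print(y[:5])
```

Skewed errors ($b \neq 0$) require the general form of the representation, which is not reproduced here.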

Denoting by $\hat\beta^{(n)}(j) = (\hat\beta_1^{(n)}(j), \ldots, \hat\beta_K^{(n)}(j))'$ ($j = 1, \ldots, M$; $K = 2$ or $4$ depending on the model) an
estimator $\hat\beta^{(n)}$ computed from the $j$th replication, the empirical bias and empirical mean square error for
the first component of $\hat\beta^{(n)}$ are

$$\mathrm{BIAS}(\hat\beta^{(n)}) := \frac{1}{M} \sum_{j=1}^{M} \big(\hat\beta_1^{(n)}(j) - 1\big) \quad \text{and} \quad \mathrm{MSE}(\hat\beta^{(n)}) := \frac{1}{M} \sum_{j=1}^{M} \big(\hat\beta_1^{(n)}(j) - 1\big)^2,$$

respectively; models (3.14) and (3.15) being perfectly symmetric, efficiency comparisons can be based on
that first component only. These quantities were computed for the least squares $\hat\beta^{(n)}_{LS}$ and the LAD
estimators $\hat\beta^{(n)}_{LAD}$, the one-step versions $\hat\beta^{(n)}_{J_{vdW}}$, $\hat\beta^{(n)}_{J_W}$ and $\hat\beta^{(n)}_{J_L}$ of the van der Waerden, Wilcoxon and Laplace
estimators, and the one-step R-estimators $\hat\beta^{(n)}_{J_{\alpha/b}}$ associated with the stable scores with tail index $\alpha$ and
skewness parameter $b$ ($\alpha = 1.8$/$b = 0$; $\alpha = 1.8$/$b = 0.5$; $\alpha = 1.2$/$b = 0$; $\alpha = 1.2$/$b = 0.5$; $\alpha = 0.5$/$b = 0.5$),
respectively. For the sake of comparison, we also computed the bias and mean square errors associated
with the Argmin (Hodges-Lehmann 1963; Jurečková 1971) versions $\hat\beta^{(n)}_{HL;W}$, $\hat\beta^{(n)}_{HL;vdW}$ and $\hat\beta^{(n)}_{HL;1.8/0}$ of the
Wilcoxon, van der Waerden, and stable score ($\alpha = 1.8$/$b = 0$) R-estimators; the latter were computed via
the Nelder-Mead (1965) method.
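
The two performance measures defined above amount to the following computation on the $M$ replicated first components. The helper name and the toy input are assumptions made for illustration; the true value 1 is the one used in the simulation design.

```python
import numpy as np


def empirical_bias_mse(first_components, true_value=1.0):
    """Empirical bias and mean square error of the first component
    of an estimator over the replications."""
    errors = np.asarray(first_components, dtype=float) - true_value
    return float(errors.mean()), float(np.mean(errors ** 2))


# Toy input standing in for M = 3 replications of the first component.
bias, mse = empirical_bias_mse([0.9, 1.1, 1.3])
print(bias, mse)
```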

Results are collected in Table 2 for model (3.14) and Table 3 for model (3.15), and confirm the theoretical
findings of the previous sections. Least squares behave quite poorly, and fail miserably as the tail index
decreases, while least absolute deviations maintain an overall good performance. The empirical performances
of R-estimators are consistent with theoretical ARE rankings. Depending on the scores and the actual
underlying tail index and skewness parameter, R-estimators may or may not improve on least absolute
deviations. Stable score-based R-estimators, as a rule, outperform least absolute deviations, as expected,
under correctly specified values of the tail index.

It is worth noting that one-step R-estimators are doing better than their Hodges-Lehmann counterparts in
model (3.15), that is, when the parameter is of dimension four. This is most probably due to computational
problems related to the Argmin approach in higher dimensions; such problems do not occur in the one-step
approach. Further evidence of this phenomenon is provided in Table 4, where we report results for
the one-step and Hodges-Lehmann versions of the van der Waerden R-estimator in regression models of the
form

$$Y^{(2)}_i = c_{i1} + c_{i2} + \dots + c_{iK} + \epsilon_i, \qquad i = 1, \dots, n = 100, \tag{3.16}$$

with K regressors, K = 6, 10, 15 (same number of replications; regression constants uniform over $[-1, 1]^K$).
Irrespective of the underlying stable density, the superiority of the one-step version increases quite
significantly with K.
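
To see why the Argmin approach calls for a derivative-free search, note that a rank statistic is a step function of the parameter: it changes only when two residuals swap ranks. The sketch below illustrates this in a deliberately simple setting that is an assumption of the example, not the paper's setup: one regressor, no intercept, Wilcoxon scores, and hypothetical function names. In one dimension the statistic is non-increasing in the parameter, so a bisection locates its sign change; with K regressors no such shortcut exists and a multidimensional search such as Nelder-Mead is needed.

```python
def ranks(values):
    """Ranks 1..n of the values (ties, which have probability zero for
    continuous data, are broken by index)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def wilcoxon_rank_statistic(beta, c, y):
    """S(beta) = sum_i c_i * (R_i(beta) / (n + 1) - 1/2), where R_i(beta) is the
    rank of the residual y_i - beta * c_i. Piecewise constant in beta."""
    n = len(y)
    residuals = [yi - beta * ci for ci, yi in zip(c, y)]
    return sum(ci * (ri / (n + 1) - 0.5) for ci, ri in zip(c, ranks(residuals)))


def argmin_r_estimate(c, y, lo=-100.0, hi=100.0, tol=1e-8):
    """Bisection for the sign change of the non-increasing step function S.
    The bracket [lo, hi] is assumed to contain the sign change."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if wilcoxon_rank_statistic(mid, c, y) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    c = [-1.0, -0.5, 0.2, 0.7, 1.0]
    y = [2.0 * ci for ci in c]  # noiseless data with slope 2
    # Same ranks, hence exactly the same statistic, at nearby parameter values.
    print(wilcoxon_rank_statistic(0.0, c, y), wilcoxon_rank_statistic(0.001, c, y))
    print(argmin_r_estimate(c, y))
```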

4. Conclusion. 

Stable densities constitute a broad and flexible class of probability density functions, allowing for asym- 
metry and heavy tails. Their theoretical properties make them quite appealing in a variety of applications, 
including econometric and financial ones. Traditional inference methods, however, in general are not valid 
in models involving stable errors: classical tests no longer satisfy nominal probability level constraints, and

Table 2: Empirical bias and mean square error for various estimators of β in model (3.14)

Estimator         Underlying stable density (α/b)
                  2/0          1.8/0        1.8/0.5      1.2/0        1.2/0.5      0.5/0.5
LS        (Bias)  .00193       -.00134      .01385       .18680       -.19255      740527.6
          (MSE)   .06770       .19459       .27336       124.46       88.070       5.3560e+14
LAD       (Bias)  .00167       -.00087      .00502       .02995       .00646       -.02438
          (MSE)   .10674       .10411       .11638       .11560       .13396       .23233
vdW       (Bias)  .00256       -.00136      .00694       .03376       -.00243      .00745
          (MSE)   .06878       .07694       .08545       .15165       .14499       .49418
W         (Bias)  .00076       .00015       .00920       .02957       -.00147      -.00165
          (MSE)   .07234       .07454       .08366       .12060       .12219       .29830
L         (Bias)  .00167       -.00087      .00502       .02995       .00646       -.02438
          (MSE)   .10674       .10411       .11638       .11560       .13396       .23232
1.8/0     (Bias)  .00250       .00063       .00883       .03046       .00068       .00267
          (MSE)   .07088       .07457       .08310       .12976       .12820       .36304
1.8/0.5   (Bias)  .00187       -.00119      .01057       .03284       -.00037      .00284
          (MSE)   .07104       .07683       .08139       .13562       .12398       .34625
1.2/0     (Bias)  .00424       .00353       .01373       .02155       -.00363      .01652
          (MSE)   .11613       .09812       .11040       .09641       .10971       .17458
1.2/0.5   (Bias)  .00670       -.00418      .01609       .02735       .00310       -.00199
          (MSE)   .11416       .10382       .10822       .11455       .08917       .11282
0.5/0.5   (Bias)  .01070       .03350       .00357       .04768       -.01671      .00466
          (MSE)   .22575       .28311       .24386       .35926       .18999       .12103
HL;vdW    (Bias)  -.01668      -.01040      -.00253      .04306       -.01664      .11740
          (MSE)   .07936       .08958       .09508       .20227       .20441       1.1934
HL;W      (Bias)  -.00672      -.02019      -.01113      -.01052      -.03408      -.24449
          (MSE)   .08225       .09071       .09702       .16290       .14918       .82852
HL;1.8/0  (Bias)  -.02274      -.02834      -.01923      -.01504      -.05129      .24827
          (MSE)   .09066       .10291       .10488       .18247       .19072       .96871

Empirical bias and MSE of the least squares $\hat{\beta}^{(n)}_{\mathrm{LS}}$, the LAD $\hat{\beta}^{(n)}_{\mathrm{LAD}}$ and various rank-based estimators,
computed from 1000 replications of model (3.14) with sample size n = 100, under various stable error
distributions. Rows vdW, W, L and α/b refer to the one-step R-estimators with the corresponding scores; rows
HL;· refer to the Argmin (Hodges-Lehmann) versions.



Table 3: Empirical bias and mean square error for various estimators of β in model (3.15)

Estimator         Underlying stable density (α/b)
                  2/0          1.8/0        1.8/0.5      1.2/0        1.2/0.5      0.5/0.5
LS        (Bias)  .00314       .01367       -.01945      -4.09468     -.09272      -47944.35
          (MSE)   .06339       .30161       .12752       15818.91     39.45292     1.23211e+13
LAD       (Bias)  .00693       .00880       -.00774      -.00652      .00352       -.00746
          (MSE)   .09995       .09992       .09548       .08495       .09984       .21871
vdW       (Bias)  .00378       .00638       -.01177      -.00763      -.01262      -.01902
          (MSE)   .06463       .06964       .07238       .11369       .11015       .35648
W         (Bias)  .00542       .00579       -.01236      -.00624      -.00774      -.01330
          (MSE)   .06811       .06847       .06988       .09038       .09127       .22657
L         (Bias)  .00693       .00880       -.00774      -.00652      .00352       -.00746
          (MSE)   .09995       .09992       .09548       .08495       .09984       .21871
1.8/0     (Bias)  .00499       .00531       -.01221      -.00445      -.00980      -.01629
          (MSE)   .06755       .06735       .07021       .09908       .09562       .27044
1.8/0.5   (Bias)  .00339       .00526       -.01109      -.00438      -.01151      -.01722
          (MSE)   .06686       .06914       .06977       .10095       .09397       .25358
1.2/0     (Bias)  .00802       .00608       -.01297      .00682       .00404       .00226
          (MSE)   .10763       .09229       .08986       .07061       .08406       .13542
1.2/0.5   (Bias)  .00291       .00024       -.01401      .00396       -.00231      -.00573
          (MSE)   .10332       .09233       .08567       .09036       .07037       .07636
0.5/0.5   (Bias)  .03400       .03653       -.02823      -.05925      -.00469      -.01970
          (MSE)   .30150       .35030       .28818       .43049       .18807       .19423
HL;vdW    (Bias)  .00401       .00634       -.01208      -.00704      -.01234      .02138
          (MSE)   .06513       .06968       .07266       .11310       .10956       .38167
HL;W      (Bias)  .00513       .00623       -.01285      -.00547      -.00755      .01470
          (MSE)   .06854       .06855       .07006       .09010       .09100       .23734
HL;1.8/0  (Bias)  .00494       .00582       -.01245      -.00396      -.01081      .01793
          (MSE)   .06783       .06753       .07037       .09854       .09594       .28729

Empirical bias and MSE of the least squares $\hat{\beta}^{(n)}_{\mathrm{LS}}$, the LAD $\hat{\beta}^{(n)}_{\mathrm{LAD}}$ and various rank-based estimators,
computed from 1000 replications of model (3.15) with sample size n = 100, under various stable error
distributions. Rows vdW, W, L and α/b refer to the one-step R-estimators with the corresponding scores; rows
HL;· refer to the Argmin (Hodges-Lehmann) versions.

Table 4: One-step R-estimation versus Argmin



Estimator         Underlying stable density (α/b)
                  2/0          1.8/0        1.8/0.5      1.2/0        1.2/0.5      0.5/0.5
K = 6
vdW       (Bias)  -.01991      -.00485      .01084       -.01890      .02246       .00162
          (MSE)   .07707       .08821       .08935       .16485       .15258       .61554
HL;vdW    (Bias)  -.19519      -.19834      -.19202      -.36809      -.30435      -.59222
          (MSE)   .24257       .27483       .27461       .58981       .52245       2.51344
K = 10
vdW       (Bias)  -.00877      .00607       .00187       -.00807      -.01376      .06003
          (MSE)   .07834       .09133       .08641       .16835       .15545       1.4346
HL;vdW    (Bias)  -.91080      -.89626      -.92196      -1.00979     -.99976      -.97662
          (MSE)   1.04321      1.07289      1.09949      1.50269      1.43327      3.23870
K = 15
vdW       (Bias)  -.00374      -.01421      -.00575      .02479       .00271       .01123
          (MSE)   .08894       .10969       .10539       .20918       .19621       2.00335
HL;vdW    (Bias)  -1.07573     -1.11915     -1.11057     -1.23107     -1.21492     -1.31910
          (MSE)   1.19685      1.33319      1.32890      1.91879      1.88120      4.32374

Empirical bias and MSE of the one-step and Argmin versions $\underset{\sim}{\beta}{}^{(n)}_{\mathrm{vdW}}$ and $\underset{\sim}{\beta}{}^{(n)}_{\mathrm{HL};\mathrm{vdW}}$ of the van der Waerden
R-estimator, computed from 1000 replications of model (3.16) with K = 6, 10, 15, sample size n = 100 and
various stable error distributions.

estimators, as a rule, are rate-suboptimal. On the other hand, due to the absence of closed-form
likelihoods, theoretical optimality results are not easily derived. And, still for the same reason, their
practical implementation is anything but straightforward.

In the particular case of linear models with stable errors (with unspecified tail index α and skewness
parameter b), Hallin et al. (2010) show how rank-based methods provide a powerful and convenient solution to
testing problems. In order to do so, they first establish the uniform local asymptotic normality (ULAN, with
root-n contiguity rates) of linear model experiments with stable errors. In this paper, we extend their
approach to estimation problems. More particularly, taking full advantage of the ULAN property, we construct
one-step R-estimators for the regression parameter β. Those estimators are root-n consistent and
asymptotically normal, irrespective of the underlying stable density, and their asymptotic covariance matrices
are obtained as a by-product of the one-step procedure. Using numerical results derived in Hallin et al. (2010),
we moreover show how to construct the R-estimators associated with stable scores, achieving parametric
optimality at prespecified values of α and b.

A thorough Monte Carlo study confirms the excellent finite-sample performances of our one-step
R-estimators, which are shown to outperform not only the traditional OLS and LAD estimators, but also
their Argmin or Hodges-Lehmann counterparts.

5. Appendix. 

Proof of Proposition 2.2. Point (i) is a direct consequence of the Hájek projection theorem. Points (ii)
and (iii) follow from point (i), the central limit theorem and Le Cam's third lemma. As for point (iv),
Theorem 3.1 in Jurečková (1969) applies. □

Proof of Proposition 2.3. In view of (2.9), we have that

[Displays (5.17), (5.18) and (5.19), stated for $g \in \mathcal{F}$, are missing from the extracted text.] □

Proof of Proposition 2.4. Without loss of generality, we assume that the $\epsilon_i$'s have median zero. In
this proof, we show that $n^{1/2}(\hat{\beta}^{(n)}_{\mathrm{LAD}} - \beta) = n^{1/2}(\underset{\sim}{\beta}{}^{(n)}_{L} - \beta) + o_P(1)$. Least absolute
deviation estimation is equivalent to median regression, hence to quantile regression with quantile of
order $\tau = 1/2$; from the proof of Theorem 4.1 in Koenker (2005) (see also Koenker and Bassett 1978), we
thus have that

[Display (5.20) is missing from the extracted text.]

under $\mathrm{P}^{(n)}$. Now, since $J_L(u) = \sqrt{2}\,\mathrm{sign}(u - 1/2)$, we have that

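[Display (5.21) is missing from the extracted text.]

Using (5.21), (5.19) in the proof of Proposition 2.3 and point (i) of Proposition 2.2, we obtain that

$$
\begin{aligned}
n^{1/2}\bigl(\underset{\sim}{\beta}{}^{(n)}_{L} - \beta\bigr)
&= K^{(n)} \mathcal{J}^{-1}(J_L, g)\, \underset{\sim}{\Delta}{}^{(n)}_{J_L}(\beta) + o_P(1) \\
&= K^{(n)} \mathcal{J}^{-1}(J_L, g)\, \Delta^{(n)}_{J_L; g}(\beta) + o_P(1) \\
&= \frac{n^{-1/2}}{2 g(0)} \sum_{i=1}^{n} \bigl(K^{(n)}\bigr)^{2}\, \mathrm{sign}(\epsilon_i)\, c_i + o_P(1),
\end{aligned}
$$

which, in view of (5.20), completes the proof. □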
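
The constant $1/(2g(0))$ in the last step can be traced with a short computation. The worked equation below is a sketch under assumptions that are not restated in this appendix: the cross-information is taken to be $\mathcal{J}(J, g) = \int_0^1 J(u)\,\varphi_g(G^{-1}(u))\,du$ with $\varphi_g = -g'/g$, where $G$ is the distribution function of $g$, $g$ is absolutely continuous and vanishes at $\pm\infty$, and $G(0) = 1/2$ (median zero).

```latex
% Change of variable u = G(x), du = g(x) dx, so that
% \varphi_g(x) g(x) = -g'(x) and sign(G(x) - 1/2) = sign(x).
\begin{align*}
\mathcal{J}(J_L, g)
  &= \sqrt{2} \int_0^1 \operatorname{sign}(u - 1/2)\, \varphi_g\bigl(G^{-1}(u)\bigr)\, du \\
  &= \sqrt{2} \int_{-\infty}^{\infty} \operatorname{sign}(x)\, \bigl(-g'(x)\bigr)\, dx \\
  &= \sqrt{2} \Bigl( \int_{-\infty}^{0} g'(x)\, dx - \int_{0}^{\infty} g'(x)\, dx \Bigr)
   = 2\sqrt{2}\, g(0), \\
\frac{\sqrt{2}}{\mathcal{J}(J_L, g)} &= \frac{1}{2\, g(0)} .
\end{align*}
```

The factor $\sqrt{2}$ in the numerator comes from the Laplace score $J_L(u) = \sqrt{2}\,\mathrm{sign}(u - 1/2)$ evaluated at the residuals, which under the median-zero assumption reduces to $\sqrt{2}\,\mathrm{sign}(\epsilon_i)$.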